Evaluation of Natural Language Interfaces to Data Base Systems

Author

  • Bozena Henisz Thompson
Abstract

Is evaluation, like beauty, in the eye of the beholder? The answer is far from simple because it depends on who is considered to be the proper beholder. Evaluators may range from casual users to society as a whole, with system builders, sophisticated users, linguists, grant providers, system buyers, and others in between. The members of this panel are system builders and linguists, or rather the two fused into one, but, I believe, interested in all or almost all actual or potential bodies of evaluators. One of our colleagues expressed a forceful opinion while being a member of a similar panel at last year's ACL conference: "Those of us on this panel and other researchers in the field simply don't have the right to determine whether a system is practical. Only the users of such a system can make that determination. Only a user can decide whether the NL [natural language] capability constitutes sufficient added value to be deemed practical. Only a user can decide if the system's frequency of inappropriate response is sufficiently low to be deemed practical. Only a user can decide whether the overall NL interaction, taken in toto, offers enough benefits over alternative formal interactions to be deemed practical" [1].
It is hard for me to disagree, since I argued as forcefully on the basis of my study of users' evaluation of machine translation [2], a study which was prompted by the evaluations of the quality of machine translation as viewed by linguists and users, ranging from 35% acceptable for the former to 90% for the latter. What the study also showed was that the practicality of the output could indeed only be judged by the users, since even incomplete and stylistically very inelegant translations were found quite useful in practice because they, on the one hand, provided, however crudely, the information sought by the users, and, on the other hand, the users themselves brought knowledge that made the texts far more understandable and useful than might appear to a nonspecialist linguist. But this endorsement on my part of the user as the ultimate judge in evaluations does not preclude my fully subscribing to Norm Sondheimer's [3] introductory comments to this panel stating that to "make progress as a field, we need to be able to evaluate." We are now less likely to confuse the issue of the evaluation by people like ourselves and the judgment of the users, less likely to be surprised at the discrepancies, and less likely to be surprised at the users' acceptance of the limitations of our NL interfaces. Also, we are far more aware of the fact that evaluations of "worth" or "quality" have to be conducted in the contexts of the actual, perceived needs.
In extensive studies on evaluation of innovations, Mosteller [4], the recently retired president of AAAS, found that "successful innovators better understand user needs; [and] pay more attention to marketing...." The same source, however, leads me to the notorious difficulties of evaluation given the wide range of evaluators and their purposes. We are all undoubtedly convinced of the value of NLI for the society as a whole, but the evaluation of experiments with these interfaces is another matter. Mosteller was faced with social, sociomedical, and medical fields. Let me recount some of the studies he and his team made for reasons which will soon become obvious. His team scored a given program on a scale from plus two to minus two, with zero meaning there was essentially no gain. Accordingly, a study of delinquent girls that identified them but failed to prevent them from delinquency received a zero. Likewise, a zero was assigned to a probation experiment for conviction for public drunkenness in which three methods were used: (1) no treatment, (2) an alcoholic clinic, and (3) Alcoholics Anonymous. Since the "no treatment" group performed somewhat better, short-term referrals were considered of no value. A minus one was given to a study whose results were opposite to those hoped for: a major insurance company increased outpatient benefits in the hope of decreasing hospital costs, but the outpatient group's hospital stays increased.
Finally, a double plus was awarded to an experiment involving the Salk vaccine, which was, predictably, very successful. Now this kind of evaluation may be justified when the needs of the society are at stake. I have gone into these details, however, for the purpose of expressing the opinion, in which I know I'm not alone, that negative results are as important as positive ones, that evaluation in our case is almost equivalent to the amount of information obtained in an experiment. An experiment whose results would be totally predictable would be almost useless, but one with results different from those hoped for might be embarrassing but very valuable. Another comment prompted by those evaluations is that the application of any rigid, fine scale is totally inappropriate in the case of NLI evaluations.

Similar resources

Natural Language Planning Dialogue for Intelligent Applications

The goal of this project is to develop the underlying technologies for spoken dialogue systems that serve as interfaces to complex, state-of-the-art reasoning systems. Most current speech and natural language projects are focusing on applications that involve very little intelligent reasoning, such as data-base query and form-filling. However, the great promise for speech and natural language in...

Natural Language Planning Dialogue for Interactive

The goal of this project is to develop the underlying technologies for spoken dialogue systems to serve as highly interactive interfaces to AI-based reasoning systems. Most current speech and natural language projects are focusing on applications that involve only limited dialog, and little intelligent reasoning, such as data-base query and form-filling applications. But the great promise for s...

Chatbots: Can They Serve as Natural Language Interfaces to Qa Corpus?

A chatbot is a program which can chat in natural language, on a topic built into the chatbot’s internal knowledge model. Many chatbots exist, with different knowledge-bases programmed by the chatbot builders. We have built a system to convert a website text (corpus) to a chatbot knowledge-base format. In this paper the chatbot is used as a question answer interface, where TRE09 QA track is used...

CERC Technical Report Series Technical Memoranda CERC-TR-TM-92-001 KNOWLEDGE-DIRECTED GRAPHICAL AND NATURAL LANGUAGE INTERFACE WITH A KNOWLEDGE-BASED CONCURRENT ENGINEERING ENVIRONMENT

Effective on-line communication between cooperating users with the mediation of machines (Virtual Team concept) is a necessity in new-generation concurrent engineering design and manufacture systems. The communication means for solving this problem are natural language and graphics. Managing the graphical and natural language user interfaces is a task full of ill-structured problems for which t...

Web-Based Interfaces For Natural Language Processing Tools

We have built web interfaces to a number of Natural Language Processing technologies. These interfaces allow students to experiment with different inputs and view corresponding output and inner workings of the systems. When possible, the interfaces also enable the student to modify the knowledge bases of the systems and view the resulting change in behavior. Such interfaces are important becaus...

Isolating Domain Dependencies In Natural Language Interfaces

Isolating the domain-dependent information within a large natural language system offers the general advantages of modular design and greatly enhances the portability of the system to new domains. We have explored the problem of isolating the domain dependencies within two large natural language systems, one for generating a tabular data base from text ("information formatting"), the other for ...

Publication date: 1981